Abstract

Consider testing normality against a one-parameter family of univariate distributions
containing the normal distribution as the boundary, e.g., the family of
t-distributions or an infinitely divisible family with finite variance. We prove that
under mild regularity conditions, the sample skewness is the locally best invariant
(LBI) test of normality against a wide class of asymmetric families and the kurtosis
is the LBI test against symmetric families. We also discuss non-regular cases
such as testing normality against the stable family and some related results in the
multivariate cases.

Keywords and phrases: generalized hyperbolic distribution, infinitely divisible distribu- 
tion, normal mixture, outlier detection, stable distribution. 



1 Introduction 

In 1935, E.S. Pearson remarked:

". . . it seems likely that for large samples and when only small departures
from normality are in question, the most efficient criteria will be based on the
moment coefficients of the sample, e.g. on the values of $\sqrt{\beta_1}$ and $\beta_2$."

Surprisingly this statement has never been formally proved, although there exists a large
literature on testing normality and on the sampling distributions of the skewness and the
kurtosis. See Thode (2002) for a comprehensive survey on tests of normality. The purpose
of this paper is to give a proof of this statement for fixed sample size (n > 3) under
general regularity conditions for a wide class of alternatives, including the normal mixture
alternatives and the infinitely divisible alternatives with finite variance. Technically all
the necessary ingredients are already given in the literature. Therefore the merit of this
paper is to give a clear statement and a proof of this basic fact in a unified framework
and also to consider some non-regular cases, in particular testing normality against the
stable family.

In fact "non-regular" may not be an appropriate term, because by considering contamination
type alternatives, we see that there are functional degrees of freedom in constructing
an alternative family and the locally best invariant test against the family. Therefore
by "small departure" we are excluding contamination type departures from normality.
See our discussion at the end of Section 2.

In this paper we are concerned with testing the null hypothesis that the true distribution
belongs to the normal location-scale family, against the alternatives of other location-scale
families. We are mainly interested in invariant testing procedures with respect to
the location and the scale changes of the observations. In the context of outlier detection,
Ferguson (1961) proved that the skewness and the kurtosis are the locally best invariant
tests of normality for slippage type models of outliers. In Ferguson's setting, the
proportion of outliers can be substantial but the amount of slippage tends to zero.
In establishing the LBI property, Ferguson (1961) derived the basic result (see Proposition 1
below) on the likelihood ratio of the maximal invariant under the location-scale
transformation. The same result was given in Section II.2.2 of Hájek & Šidák (1967).
Uthoff (1970, 1973) used the result to derive the best invariant tests of normality against
some specific alternatives. See also Section 3.2 of Hájek et al. (1999). A general result on
the likelihood ratio of the maximal invariant was given in Wijsman (1967, 1990) and it led
to some important results of Kariya et al. (Kuwana & Kariya (1991), Kariya & George
(1994, 1995)) in the multivariate setting.

In the outlier detection setting of Ferguson (1961), if the number of outliers is distributed
according to the binomial distribution, the problem of outlier detection is logically
equivalent to testing normality against mixture alternatives. Therefore the LBI property
of the skewness and the kurtosis against mixture alternatives is a straightforward
consequence of Ferguson (1961). However Ferguson's result has not been interpreted in this
manner. In this paper we establish the LBI property of the skewness and the kurtosis in
a more general setting and treat the normal mixture model as an example.

In testing multivariate normality, even if we restrict ourselves to invariant testing
procedures, there is no single LBI test, because the maximal invariant moments are
multi-dimensional (e.g. Takemura (1993)). Furthermore the invariance can be based on the full
general linear group or the triangular group. This distinction leads to different results,
because the invariance with respect to the triangular group preserves certain multivariate
one-sided alternatives, whereas the general linear group does not. In Section 6 we discuss
these points in a setting somewhat more general than considered by Kariya and his
coauthors.

The organization of this paper is as follows. In Section 2 we state our main theorem
concerning the locally best invariant test of normality against one-sided alternatives. We
also discuss the Laplace approximation to the integral in the LBI for large sample sizes n. In
Section 3 we show that our theorem applies in particular to the normal mixture family
and the infinitely divisible family. In Section 4, as an important non-regular case, we
consider testing against the stable family. In Section 5 we compare the locally best invariant
test and tests based on the profile likelihood. Finally in Section 6 we discuss generalizations
of our main theorem to multivariate cases.

2 Locally best invariant test of univariate normality 

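Let

$$ f_{a,b}(x;\theta) = \frac{1}{b}\, f\!\left(\frac{x-a}{b};\theta\right), \qquad -\infty < a < \infty,\ b > 0,\ \theta \ge 0, $$

denote a one-parameter family of location-scale densities with the shape parameter $\theta$. We
simply write $f(x;\theta) = f_{0,1}(x;\theta)$ for the standard case $(a,b) = (0,1)$. We assume that
$\theta = 0$ corresponds to the normal density

$$ f(x;0) = \phi(x) = \frac{1}{\sqrt{2\pi}} \exp\!\left(-\frac{x^2}{2}\right). $$

Based on i.i.d. observations $x_1, \dots, x_n$ from $f_{a,b}(x;\theta)$ we want to test the null hypothesis
of normality:

$$ H_0 : \theta = 0 \quad \text{vs.} \quad H_1 : \theta > 0. \tag{1} $$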

Here we are testing normality ($\theta = 0$) against the one-sided alternatives ($\theta > 0$). If
we are concerned about heavier tails than the normal as the alternatives, this is a natural
setting. However suppose that we are concerned about asymmetry and we do not know
whether the distribution may be left-skewed or right-skewed under the alternatives. In
this case we should test normality against two-sided alternatives and then (1) is not a
suitable formulation. In this paper for simplicity we only consider one-sided alternatives,
thus avoiding the consideration of unbiasedness of tests.

It should be noted that there exists an arbitrariness in choosing a standard member
($(a,b) = (0,1)$) from a location-scale family. For the normal family we usually choose the
standard normal density $\phi(x)$ as the standard member. Note however that in Section 4 we
take $N(0,2)$ as the standard member in considering the stable alternatives for notational
convenience. Given a particular choice of standard members $f(x;\theta)$, $\theta > 0$, we can choose
another smooth set of standard members as

$$ f_{a(\theta),b(\theta)}(x;\theta) = \frac{1}{b(\theta)}\, f\!\left(\frac{x - a(\theta)}{b(\theta)};\theta\right), \tag{2} $$

where $a(\theta), b(\theta)$ are smooth functions of $\theta$ and $(a(0), b(0)) = (0,1)$. This arbitrariness
does not matter if we use invariant testing procedures. However as in the case of normal
mixture distributions in Section 3.1, it is sometimes convenient to resolve this ambiguity
in an appropriate manner. Details on parametrization are discussed in Appendix B.

As mentioned above we are primarily interested in invariant testing procedures. A
critical region $R$ is invariant if

$$ (x_1, \dots, x_n) \in R \;\Rightarrow\; (a + bx_1, \dots, a + bx_n) \in R, \qquad -\infty < \forall a < \infty,\ \forall b > 0. $$

Fix a particular alternative $\theta_1 > 0$. We state the following basic result (Theorem b in
Section II.2.2 of Hájek & Šidák (1967), Section 2 of Ferguson (1961)) on the most powerful
invariant test against $\theta_1$.

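Proposition 1. The critical region of the most powerful invariant test for testing $H_0 : \theta = 0$
against $H_1 : \theta = \theta_1 > 0$ is given by

$$ \frac{\int_0^\infty \int_{-\infty}^{\infty} \prod_{i=1}^n f(a + bx_i;\theta_1)\, b^{n-2}\, da\, db}{\int_0^\infty \int_{-\infty}^{\infty} \prod_{i=1}^n f(a + bx_i;0)\, b^{n-2}\, da\, db} > k \tag{3} $$

for some $k > 0$.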



Note that the values $(x_1, \dots, x_n)$ can be replaced by any maximal invariant of the
location-scale transformation, since the ratio in (3) is invariant. For our purposes it is
most convenient to replace $x_i$, $i = 1, \dots, n$, by the standardized value

$$ z_i = \frac{x_i - \bar x}{s}, \qquad \bar x = \frac{1}{n}\sum_{i=1}^n x_i, \quad s^2 = \frac{1}{n}\sum_{i=1}^n (x_i - \bar x)^2. \tag{4} $$

Then $\sum_{i=1}^n z_i = 0$ and $\sum_{i=1}^n z_i^2 = n$ and

$$ \prod_{i=1}^n f(a + bz_i;0) = \frac{1}{(2\pi)^{n/2}} \exp\!\left(-\frac{n(a^2 + b^2)}{2}\right). $$
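The standardization in (4) and the resulting closed form of the null likelihood can be checked numerically. The following is a minimal sketch, not part of the original derivation; the sample `xs` and the evaluation point `(a, b) = (0.3, 1.1)` are arbitrary illustrative choices.

```python
import math

def standardize(xs):
    """Return z_i = (x_i - mean) / s with s^2 = (1/n) * sum (x_i - mean)^2, as in (4)."""
    n = len(xs)
    mean = sum(xs) / n
    s = math.sqrt(sum((x - mean) ** 2 for x in xs) / n)
    return [(x - mean) / s for x in xs]

def normal_pdf(x):
    """Standard normal density phi(x)."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def null_product(zs, a, b):
    """Product of phi(a + b * z_i) over the standardized sample."""
    prod = 1.0
    for z in zs:
        prod *= normal_pdf(a + b * z)
    return prod

def closed_form(n, a, b):
    """(2 pi)^(-n/2) * exp(-n (a^2 + b^2) / 2)."""
    return (2.0 * math.pi) ** (-n / 2.0) * math.exp(-0.5 * n * (a * a + b * b))

xs = [1.2, -0.7, 3.4, 0.1, 2.2, -1.5]                 # hypothetical sample
zs = standardize(xs)
shifted = standardize([5.0 + 2.5 * x for x in xs])    # location-scale change
```

Because `zs` is unchanged by the map x -> a + b x with b > 0, any statistic computed from `zs` alone is automatically invariant.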



Therefore, as in (26) of Section II.2.2 of Hájek & Šidák (1967), the denominator of (3)
becomes the following constant:

$$ \int_0^\infty \int_{-\infty}^{\infty} \prod_{i=1}^n f(a + bz_i;0)\, b^{n-2}\, da\, db = \frac{\Gamma((n-1)/2)}{2\, n^{n/2}\, \pi^{(n-1)/2}}. $$



Since we are considering a fixed sample size $n$, this constant can be ignored in (3) and
the rejection region is written as

$$ \int_0^\infty \int_{-\infty}^{\infty} \prod_{i=1}^n f(a + bz_i;\theta_1)\, b^{n-2}\, da\, db > k'. \tag{5} $$
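For small $n$ the double integral in (5) can be evaluated by brute-force quadrature. The sketch below is only illustrative: the midpoint rule, the truncation limits `a_max`, `b_max` and the grid size `steps` are assumptions chosen for a sample of size 5, not values from the paper. Passing the normal density should reproduce the constant $\Gamma((n-1)/2)/(2 n^{n/2}\pi^{(n-1)/2})$, which gives a sanity check on the quadrature.

```python
import math

def standardize(xs):
    n = len(xs)
    mean = sum(xs) / n
    s = math.sqrt(sum((x - mean) ** 2 for x in xs) / n)
    return [(x - mean) / s for x in xs]

def normal_pdf(x):
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def student_t_pdf(x, df):
    """Student t density; used here only as an example alternative."""
    log_c = math.lgamma((df + 1) / 2) - math.lgamma(df / 2) - 0.5 * math.log(df * math.pi)
    return math.exp(log_c - 0.5 * (df + 1) * math.log1p(x * x / df))

def invariant_integral(zs, density, a_max=4.0, b_max=5.0, steps=200):
    """Midpoint-rule approximation of
    int_0^inf int_-inf^inf prod_i density(a + b z_i) * b^(n-2) da db."""
    n = len(zs)
    ha = 2.0 * a_max / steps
    hb = b_max / steps
    total = 0.0
    for j in range(steps):
        b = (j + 0.5) * hb
        weight = b ** (n - 2)
        for i in range(steps):
            a = -a_max + (i + 0.5) * ha
            prod = 1.0
            for z in zs:
                prod *= density(a + b * z)
            total += prod * weight
    return total * ha * hb

def null_constant(n):
    """Closed form of the denominator of (3) at the standardized sample."""
    return math.gamma((n - 1) / 2) / (2.0 * n ** (n / 2) * math.pi ** ((n - 1) / 2))

zs = standardize([0.2, -1.1, 0.5, 2.7, -0.4])   # hypothetical sample, n = 5
numeric_null = invariant_integral(zs, normal_pdf)
ratio_t5 = invariant_integral(zs, lambda x: student_t_pdf(x, 5.0)) / null_constant(len(zs))
```

Here `ratio_t5` is the left-hand side of (3) for a t alternative with 5 degrees of freedom; the test rejects when it is large. For heavier-tailed alternatives or larger $n$ the truncation limits would need to be revisited.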
We now consider $\theta = \theta_1$ close to 0. For a while we proceed formally. Throughout this
paper we assume that $l(x;\theta) = \log f(x;\theta)$ is continuously differentiable with respect to $\theta$
including the boundary $\theta = 0$. Then

$$ l(x;\theta) = l(x;0) + l_\theta(x;0)\,\theta + o(\theta), $$

where

$$ l_\theta(x;\theta) = \frac{\partial}{\partial\theta} \log f(x;\theta) $$

is the score function. Therefore

$$ f(x;\theta) = f(x;0) \exp\bigl(l_\theta(x;0)\,\theta + o(\theta)\bigr) = f(x;0)\bigl(1 + l_\theta(x;0)\,\theta\bigr) + o(\theta) $$

and

$$ \prod_{i=1}^n f(a + bz_i;\theta) = \prod_{i=1}^n f(a + bz_i;0) \Bigl(1 + \theta \sum_{i=1}^n l_\theta(a + bz_i;0)\Bigr) + o(\theta) $$
$$ \qquad = \frac{1}{(2\pi)^{n/2}} \exp\!\left(-\frac{n(a^2 + b^2)}{2}\right) \Bigl(1 + \theta \sum_{i=1}^n l_\theta(a + bz_i;0)\Bigr) + o(\theta). $$

It follows that for small $\theta = \theta_1$ the rejection region (5) can be approximately written as

$$ T(z_1, \dots, z_n) = \int_0^\infty \int_{-\infty}^{\infty} \sum_{i=1}^n l_\theta(a + bz_i;0) \exp\!\left(-\frac{n(a^2 + b^2)}{2}\right) b^{n-2}\, da\, db > k''. \tag{6} $$



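In order to justify the above derivation we assume the following convenient regularity
condition.

Assumption 1. For some $\epsilon > 0$

$$ \int_0^\infty \int_{-\infty}^{\infty} g_n(a, b; \epsilon) \exp\!\left(-\frac{n(a^2 + b^2)}{2}\right) b^{n-2}\, da\, db < \infty, $$

where

$$ g_n(a, b; \epsilon) = \sup_{|z| \le \sqrt{n},\ 0 \le \theta \le \epsilon} \frac{\left|\partial f(a + bz;\theta)/\partial\theta\right|}{f(a + bz;0)}. $$

Under this regularity condition we have the following theorem.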

Theorem 1. Under Assumption 1 the unique rejection region of the locally best invariant
test of normality $H_0 : \theta = 0$ vs. $H_1 : \theta > 0$ is given by (6), provided that
$P_0(T(z_1, \dots, z_n) = k'') = 0$ under $H_0$.

A straightforward proof is given in Appendix A.1. Note that the statement of this
theorem is slightly complicated by the requirement that $P_0(T(z_1, \dots, z_n) = k'') = 0$ under
$H_0$. We need this requirement because if $P_0(T(z_1, \dots, z_n) = k'') > 0$, in order to maximize
the local power we have to look at $O(\theta^2)$ terms in the expansion of $f(x;\theta)$ around $\theta = 0$.

A particularly simple result is obtained when $l_\theta(x;0)$ is a polynomial of degree $k$ in $x$.
In this case $l_\theta(a + bz_i;0)$ is a polynomial in $a$, $b$ and $z_i$ and is written as

$$ l_\theta(a + bz_i;0) = p_0(a,b)\, z_i^k + p_1(a,b)\, z_i^{k-1} + \cdots + p_k(a,b), \tag{7} $$

where $p_0(a,b), \dots, p_k(a,b)$ are polynomials in $a$ and $b$. Denote the standardized $l$-th central
moment by

$$ \tilde m_l = \frac{1}{n} \sum_{i=1}^n z_i^l = \frac{1}{n s^l} \sum_{i=1}^n (x_i - \bar x)^l. $$

Since $\tilde m_1 = 0$ and $\tilde m_2 = 1$, the average of (7) is written as

$$ \frac{1}{n} \sum_{i=1}^n l_\theta(a + bz_i;0) = p_0(a,b)\,\tilde m_k + \cdots + p_{k-3}(a,b)\,\tilde m_3 + p_{k-2}(a,b) + p_k(a,b). $$

Furthermore the integral $\int_0^\infty \int_{-\infty}^{\infty} p_j(a,b) \exp(-n(a^2 + b^2)/2)\, b^{n-2}\, da\, db$ can be explicitly
evaluated. See Appendix C. In particular if $l_\theta(x;0)$ is a third degree polynomial, then
(6) is equivalent to the standardized sample skewness of the observations. Now consider the
case that $l_\theta(x;0)$ is a fourth degree polynomial without odd degree terms. Then

$$ \int_{-\infty}^{\infty} a^{2m+1} \exp\!\left(-\frac{n a^2}{2}\right) da = 0 $$

implies that $\int_{-\infty}^{\infty} p_{k-3}(a,b) \exp(-n a^2/2)\, da = 0$ in (7). Therefore (6) is equivalent to the
standardized sample kurtosis. We now have the following corollary.

Corollary 1. Assume the same regularity condition as in Theorem 1. If the score
function $l_\theta(x;0)$ is a third degree polynomial in $x$, then the locally best invariant test of
normality is given by the standardized sample skewness. If $l_\theta(x;0)$ is a fourth degree
polynomial in $x$ without odd degree terms, then the locally best invariant test of normality
is given by the standardized sample kurtosis.
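Because the standardized skewness and kurtosis are location-scale invariant, their null distributions do not depend on $(a, b)$, so one-sided critical values for a fixed $n$ can be simulated from standard normal samples. The following sketch illustrates this; the Monte Carlo size `reps`, the seed and the helper names are illustrative assumptions and not part of the paper.

```python
import math
import random

def standardized_moment(xs, order):
    """(1/n) * sum z_i^order with z_i = (x_i - mean) / s and s^2 the 1/n variance."""
    n = len(xs)
    mean = sum(xs) / n
    s = math.sqrt(sum((x - mean) ** 2 for x in xs) / n)
    return sum(((x - mean) / s) ** order for x in xs) / n

def null_critical_value(n, order, alpha, reps=4000, seed=0):
    """Simulated upper-alpha critical value of the standardized moment under normality."""
    rng = random.Random(seed)
    stats = sorted(
        standardized_moment([rng.gauss(0.0, 1.0) for _ in range(n)], order)
        for _ in range(reps)
    )
    return stats[int(math.ceil((1.0 - alpha) * reps)) - 1]

def reject_normality(xs, order, alpha=0.05):
    """One-sided test: reject when the standardized moment exceeds the simulated cutoff."""
    return standardized_moment(xs, order) > null_critical_value(len(xs), order, alpha)

skew_cut = null_critical_value(20, 3, 0.05)   # cutoff for the skewness test, n = 20
kurt_cut = null_critical_value(20, 4, 0.05)   # cutoff for the kurtosis test, n = 20
```

Use `order=3` for right-skewed alternatives and `order=4` for heavy-tailed symmetric alternatives, matching the one-sided formulation (1).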

In the next section we show that in two important cases, $l_\theta(x;0)$ is a third degree
polynomial for asymmetric alternatives and is a fourth degree polynomial in $x$ without
odd degree terms for symmetric alternatives.

For a general score function the integral (6) may not be easy to evaluate. Although
in this paper we are considering fixed $n$, we here discuss the Laplace approximation to the
integral (6) for large $n$. Let $A$ denote a random variable having the distribution $N(0, 1/n)$
and let $B$ denote the random variable, independent of $A$, such that $\sqrt{n}\,B$ has the
$\chi$-distribution with $n - 1$ degrees of freedom. Then as $n \to \infty$, $(A, B)$ converges to $(0, 1)$ in
distribution (or equivalently in probability). Note that except for the normalizing constant, the
integral in (6) can be written as $E[\sum_{i=1}^n l_\theta(A + Bz_i;0)]$. Under mild regularity conditions, for
large $n$, this expectation is simply approximated by putting $(A, B) = (0, 1)$:

$$ \tilde T(z_1, \dots, z_n) = \sum_{i=1}^n l_\theta(z_i;0). $$

It is easily shown that this is in fact the Laplace approximation (e.g. Bleistein & Handelsman
(1986)) to the integral in (6). We call $\tilde T$ the approximate LBI for testing normality. Under
mild regularity conditions, the approximate LBI and the LBI should be asymptotically
equivalent.



In Appendix A.1 of Kuriki & Takemura (2001) it is shown that the test based on
the $k$-th standardized sample cumulant is asymptotically equivalent to the test based on
$\sum_{i=1}^n H_k(z_i)$, where $H_k$ is the $k$-th Hermite polynomial. We see that the $k$-th standardized
sample cumulant is characterized as an approximate LBI for the case that the score
function is given by $H_k(x)$. See a further discussion in Section 5. When $n$ is not too large,
we may consider evaluating $E[\sum_{i=1}^n l_\theta(A + Bz_i;0)]$ by numerical integration or by Monte
Carlo sampling.
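The Monte Carlo evaluation of $E[\sum_i l_\theta(A + Bz_i;0)]$ mentioned above can be sketched as follows. This is a minimal illustration under stated assumptions: the score is taken to be the Hermite polynomial $H_3(x) = x^3 - 3x$, the data and the number of draws are arbitrary, and $A$ and $B$ are sampled independently. For this cubic score the expectation reduces to $E[B^3]\sum_i z_i^3$, which provides an exact reference value.

```python
import math
import random

def standardize(xs):
    n = len(xs)
    mean = sum(xs) / n
    s = math.sqrt(sum((x - mean) ** 2 for x in xs) / n)
    return [(x - mean) / s for x in xs]

def hermite3(x):
    return x ** 3 - 3.0 * x

def lbi_monte_carlo(zs, score, draws=20000, seed=1):
    """Monte Carlo estimate of E[sum_i score(A + B z_i)] with
    A ~ N(0, 1/n) and sqrt(n) * B ~ chi with n - 1 degrees of freedom."""
    n = len(zs)
    rng = random.Random(seed)
    total = 0.0
    for _ in range(draws):
        a = rng.gauss(0.0, 1.0 / math.sqrt(n))
        # chi-square with n-1 d.o.f. is Gamma(shape=(n-1)/2, scale=2)
        b = math.sqrt(rng.gammavariate((n - 1) / 2.0, 2.0) / n)
        total += sum(score(a + b * z) for z in zs)
    return total / draws

def approximate_lbi(zs, score):
    """Laplace approximation: plug in (A, B) = (0, 1)."""
    return sum(score(z) for z in zs)

def expected_b_cubed(n):
    """E[B^3] = 2^(3/2) * Gamma((n+2)/2) / (Gamma((n-1)/2) * n^(3/2))."""
    return 2.0 ** 1.5 * math.gamma((n + 2) / 2) / (math.gamma((n - 1) / 2) * n ** 1.5)

zs = standardize([0.1, 0.4, 0.5, 0.9, 1.1, 1.3, 1.8, 2.4, 3.9, 6.0])  # hypothetical data
mc_value = lbi_monte_carlo(zs, hermite3)
exact_value = expected_b_cubed(len(zs)) * sum(z ** 3 for z in zs)
approx_value = approximate_lbi(zs, hermite3)
```

The ratio `exact_value / approx_value` equals $E[B^3]$, which tends to 1 as $n$ grows, in line with the asymptotic equivalence of the LBI and the approximate LBI.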

For the rest of this section we make several remarks on the above results. In the
location-scale transformation $x_i \mapsto a + bx_i$ we might allow $b \ne 0$ to be negative. The
maximal invariant is $z = (z_1, \dots, z_n)'$ with $z$ identified with $-z$, or more compactly it is
$zz'$. Then an invariant critical region cannot depend on a sign preserving function $\psi$ of $z$
(i.e. $\psi(-z) = -\psi(z)$). In particular it cannot depend on the skewness $\tilde m_3$ itself, although
it can depend on $|\tilde m_3|$. In the univariate case, allowing $b < 0$ is somewhat unnatural and
we have so far only considered $b > 0$. However in the multivariate case the invariance with
respect to the full general linear group corresponds to allowing $b < 0$ in the univariate
case. We discuss this point further in Section 6.

Let $g(x)$ be a probability density. By an $\epsilon$-contamination alternative we mean a density
of the form

$$ f(x;\epsilon) = (1 - \epsilon)\phi(x) + \epsilon g(x) = \phi(x) + \epsilon(g(x) - \phi(x)). $$

Letting $\theta = \epsilon$, we see

$$ l_\theta(x;0) = \frac{g(x) - \phi(x)}{\phi(x)}. $$

Therefore as long as $g(x) = \phi(x)(1 + l_\theta(x;0))$ is a probability density, we can construct
a one-parameter contamination family of alternatives such that $T(z_1, \dots, z_n)$ in (6) is
the LBI with this score function $l_\theta(x;0)$. By "small departures from normality" Pearson
(1935) probably did not have a contamination alternative in mind. In our setting the
sample size $n$ is fixed. If $\epsilon$ is much smaller than $1/n$, we actually have no observation
from $g(x)$ with probability close to 1. In this sense a contamination family seems to
possess certain non-regularity as a family containing the normal distribution.
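The score of a contamination family at $\epsilon = 0$ is $g(x)/\phi(x) - 1$, which can be confirmed by a finite difference of the log-density. In this small sketch the contaminating density `g` is taken to be $N(3, 1)$ purely for illustration; it is not an example from the paper.

```python
import math

def phi(x):
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def g(x):
    """Hypothetical contaminating density: N(3, 1)."""
    return phi(x - 3.0)

def contaminated_density(x, eps):
    """f(x; eps) = (1 - eps) * phi(x) + eps * g(x)."""
    return (1.0 - eps) * phi(x) + eps * g(x)

def score_at_zero(x):
    """Analytic score at eps = 0: g(x) / phi(x) - 1."""
    return g(x) / phi(x) - 1.0

def finite_difference_score(x, eps=1e-6):
    """Forward difference of log f(x; eps) in eps at eps = 0."""
    return (math.log(contaminated_density(x, eps)) - math.log(phi(x))) / eps
```

Any function $l$ with $E_\phi[l] = 0$ and $l \ge -1$ arises this way, which is the "functional degrees of freedom" referred to in the Introduction.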
3 Normal mixture family and infinitely divisible family of distributions

In this section we discuss two general classes of alternatives such that the score function
at the normal distribution is a polynomial and Corollary 1 is applicable. The first is the
normal mixture family and the second is the infinitely divisible family with finite variance.



3.1 Normal mixture family 

Suppose that the mean $\mu$ and the variance $\sigma^2$ of the normal distribution $N(\mu, \sigma^2)$ have the
prior distribution $g(\mu, \sigma^2; \theta)$, $\theta > 0$, such that $g$ degenerates to the point mass at $(0, 1)$
as $\theta \to 0$. For simplicity write $\tau = 1/\sigma^2 - 1$. Then as $\theta \to 0$, both $\mu$ and $\tau$ converge to
0 in distribution. The marginal density is given by

$$ f(x;\theta) = \int_{-1}^{\infty} \int_{-\infty}^{\infty} \frac{1}{\sqrt{2\pi}} \exp\!\left(-(\tau + 1)\frac{(x - \mu)^2}{2}\right) h(\mu, \tau; \theta)\, d\mu\, d\tau, $$

where $h(\mu, \tau; \theta) = (\tau + 1)^{-3/2} g(\mu, 1/(1 + \tau); \theta)$. Consider the expansion

$$ \exp\!\left(-(\tau + 1)\frac{(x - \mu)^2}{2}\right) = \exp\!\left(-\frac{x^2}{2}\right) \exp\!\left(-(\tau + 1)\frac{\mu^2}{2}\right) \exp\!\left((\tau + 1)x\mu - \tau\frac{x^2}{2}\right) $$
$$ \qquad = \exp\!\left(-\frac{x^2}{2}\right) \exp\!\left(-(\tau + 1)\frac{\mu^2}{2}\right) \Bigl(1 + \bigl((\tau + 1)x\mu - \tau\tfrac{x^2}{2}\bigr) + \tfrac{1}{2}\bigl((\tau + 1)x\mu - \tau\tfrac{x^2}{2}\bigr)^2 + \cdots\Bigr). \tag{9} $$

The term $\exp(-(\tau + 1)\mu^2/2)$ can be absorbed into $h(\mu, \tau; \theta)$ and can be ignored. Also the
constant term (i.e. terms not involving $x$) in the expansion can be ignored. Now from
(36) of Appendix B it follows that without loss of generality we can choose the prior
distribution in such a way that the expected values of the coefficients of $x$ and $x^2$ vanish.
Therefore in (9) we only need to consider the cubic or higher degree terms in $x$ in the
expansion. Relevant terms on the right-hand side of (9) are

$$ \exp\!\left(-\frac{x^2}{2}\right) \Bigl[ -\frac{1}{2}\mu\tau x^3 + \frac{1}{6}\mu^3 x^3 + \frac{1}{8}\tau^2 x^4 - \frac{1}{4}\mu^2\tau x^4 + \frac{1}{24}\mu^4 x^4 + \cdots \Bigr]. \tag{10} $$

If only the scale parameter is mixed, i.e. if $\mu = 0$, then the dominant term is $(1/8)\tau^2 x^4$.
The primary example of this case is the family of t-distributions with $m = 1/\theta$ degrees of
freedom, where the mixing distribution for the scale is the inverse Gamma distribution.
From the above consideration it follows that the LBI test against the t-family is given by
the standardized sample kurtosis. On the other hand if only the location parameter is
mixed, i.e. $\tau = 0$ and $E_g(\mu^3) \ne 0$, then the LBI test is given by the standardized sample
skewness.
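For the t-family the polynomial form of the score can also be checked directly. Expanding the log-density of the (unstandardized) t-distribution with $m = 1/\theta$ degrees of freedom around $\theta = 0$ gives $l_\theta(x;0) = x^4/4 - x^2/2 - 1/4$, a fourth degree polynomial without odd degree terms. This closed form is derived here for illustration and is not quoted from the paper; the sketch below compares it with a finite difference, where the step `theta=1e-4` is an arbitrary choice.

```python
import math

def log_t_density(x, theta):
    """Log-density of Student's t with m = 1/theta degrees of freedom."""
    m = 1.0 / theta
    return (math.lgamma((m + 1) / 2) - math.lgamma(m / 2)
            - 0.5 * math.log(m * math.pi)
            - 0.5 * (m + 1) * math.log1p(x * x / m))

def log_normal_density(x):
    """Log-density of N(0, 1), the theta = 0 member."""
    return -0.5 * x * x - 0.5 * math.log(2.0 * math.pi)

def finite_difference_score(x, theta=1e-4):
    """Forward difference in theta at the boundary theta = 0."""
    return (log_t_density(x, theta) - log_normal_density(x)) / theta

def quartic_score(x):
    """Score at theta = 0 obtained by series expansion (illustrative derivation)."""
    return 0.25 * x ** 4 - 0.5 * x ** 2 - 0.25
```

The quartic has mean zero under $N(0,1)$ (since $3/4 - 1/2 - 1/4 = 0$), as a score function must.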

A more interesting case is that $\mu$ and $\tau$ are of the same order and the LBI test involves
both skewness and kurtosis simultaneously. This happens in a limiting case of the "normal
variance-mean mixture." In the normal variance-mean mixture, $X$ given $Y = y$ is normal
with mean $a + by$, $b \ne 0$, and variance $y$:

$$ X \mid Y = y \sim N(a + by, y), \qquad Y \sim g(y; \theta). $$

Now assume that $Y$ degenerates to a constant as $\theta \to 0$. Since we are considering
location-scale invariant tests, we can assume that $Y \to 1$ in distribution and $a = -b$. Writing
$\mu = b(y - 1)$ we have

$$ \tau = \frac{1}{y} - 1 = -\frac{y - 1}{y} = -\frac{\mu}{b} + o(|\mu|), \quad \text{or} \quad \mu = -b\tau + o(|\tau|). \tag{11} $$

Therefore $\mu$ and $\tau$ become proportional as $\theta \to 0$. In the following subsection we look at
the generalized hyperbolic distribution as an example of this case.



3.1.1 The case of the generalized hyperbolic distribution 



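The generalized hyperbolic distribution (GH distribution) was introduced by Barndorff-Nielsen
(1977). Detailed explanations including applications of GH distributions are given in
Barndorff-Nielsen & Shephard (2001), Eberlein (2001) or Masuda (2002). From Eberlein
(2001) the density is written as

$$ f_{GH}(x; \lambda, \alpha, \beta, \delta, \mu) = a(\lambda, \alpha, \beta, \delta)\, \bigl(\delta^2 + (x - \mu)^2\bigr)^{(\lambda - \frac{1}{2})/2} K_{\lambda - \frac{1}{2}}\!\left(\alpha\sqrt{\delta^2 + (x - \mu)^2}\right) \exp(\beta(x - \mu)), \tag{12} $$

where

$$ a(\lambda, \alpha, \beta, \delta) = \frac{(\alpha^2 - \beta^2)^{\lambda/2}}{\sqrt{2\pi}\, \alpha^{\lambda - \frac{1}{2}}\, \delta^{\lambda}\, K_\lambda\bigl(\delta\sqrt{\alpha^2 - \beta^2}\bigr)} $$

is the normalizing constant and $K_\lambda$ is the modified Bessel function of the third kind with
index $\lambda$:

$$ K_\lambda(z) = \frac{1}{2} \int_0^\infty y^{\lambda - 1} \exp\!\left(-\frac{1}{2} z (y + y^{-1})\right) dy, \qquad z > 0. $$

The parameter space is given by

$$ -\infty < \mu, \lambda < \infty, \quad \alpha > |\beta|, \quad \delta > 0, $$

with the additional boundaries $\{\delta = 0, \lambda > 0\}$ and $\{\alpha = |\beta|, \lambda < 0\}$.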

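The GH distribution can be characterized as a normal variance-mean mixture using the
generalized inverse Gaussian distributions (GIG distributions) as the mixing distribution.
Let $X \mid Y = y$ be distributed as $N(\mu + \beta y, y)$ and let $Y$ have the generalized inverse
Gaussian distribution with parameters $\lambda$, $\delta$, and $\gamma = \sqrt{\alpha^2 - \beta^2}$. The density of $Y$ is
written as

$$ f_{GIG}(y; \lambda, \delta, \gamma) = \left(\frac{\gamma}{\delta}\right)^{\lambda} \frac{1}{2K_\lambda(\delta\gamma)}\, y^{\lambda - 1} \exp\!\left(-\frac{1}{2}\left(\frac{\delta^2}{y} + \gamma^2 y\right)\right), \qquad y > 0, \tag{13} $$

where the parameter space is given by $\gamma, \delta > 0$, $-\infty < \lambda < \infty$, with the additional
boundaries $\{\delta = 0, \lambda > 0\}$ and $\{\gamma = 0, \lambda < 0\}$.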

In (13) let $\delta \to \infty$ and $\gamma \to \infty$ such that $\delta/\gamma \to c$; then, since the exponent in (13) is
maximized at $y = \delta/\gamma$, it is easily seen that $Y$ degenerates to $c$. Therefore the GH distribution
converges to $N(\mu + \beta c, c)$ in this limit. As above we can assume $c = 1$ and $\mu = -\beta$ without loss of
generality. We also assume that $\beta$ is fixed. For simplicity let $\delta = \gamma$. Then (13) is written
as

$$ f_{GIG}(y; \lambda, \gamma) = \frac{1}{2K_\lambda(\gamma^2)}\, y^{\lambda - 1} \exp\!\left(-\frac{\gamma^2}{2}\left(\frac{1}{y} + y\right)\right). $$

Note that this density has exponentially small tails at $y = 0$ and $y = \infty$. Therefore term
by term integration in (9) is justified.

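By (11), the main term in (10) is simply given as

$$ \exp\!\left(-\frac{x^2}{2}\right) \left(\frac{\beta}{2}\tau^2 x^3 + \frac{1}{8}\tau^2 x^4\right). $$

It follows that the rejection region of the LBI test (for a fixed $\beta$) is given by

$$ 4\beta\sqrt{n}\, c_{n+1} \sum_{i=1}^n z_i^3 + c_{n+2} \sum_{i=1}^n z_i^4 > k, $$

where

$$ c_l = \int_0^\infty b^l e^{-b^2/2}\, db = 2^{(l-1)/2}\, \Gamma\!\left(\frac{l + 1}{2}\right). \tag{14} $$

We see that the LBI test involves both the skewness and the kurtosis simultaneously and
the weight depends on the value of $\beta$.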



3.2 Infinitely divisible family 

Here we consider an infinitely divisible family with finite variance. The characteristic
function of an infinitely divisible random variable $X$ with mean 0 and variance 1 can be
written as

$$ \phi(t) = \exp\!\left[ \int_{-\infty}^{\infty} (e^{itu} - 1 - itu) \frac{1}{u^2}\, \mu(du) \right], \tag{15} $$

where the Lévy measure $\mu$ can be taken as a probability measure. Here we assume that
$X$ possesses moments up to an appropriate order. Since moments of the Lévy measure
$\mu$ are the cumulants of $X$, existence of moments of $X$ up to an appropriate order is
equivalent to the existence of moments of $\mu$ to the same order. For example if $Y$ has the
exponential distribution, the characteristic function of $X = Y - 1$ can be written as (15)
with $\mu(du) = u e^{-u}\, du$, $u > 0$ (Example 8.10 of Sato (1999)), and for the double-exponential
distribution with variance 1, $\mu(du) = |u| e^{-\sqrt{2}|u|}\, du$, $-\infty < u < \infty$.

Now we introduce the time parameter $m = 1/\theta$ and consider a Lévy process $X(m)$,
where $X = X(1)$ has the characteristic function (15). Furthermore we standardize the
variance as $X(m)/\sqrt{m}$. Then by the central limit theorem $X(m)/\sqrt{m}$ converges to $N(0, 1)$
as $m \to \infty$. The characteristic function of $X(m)/\sqrt{m}$ is written as

$$ \phi_m(t) = \exp\!\left[ \int_{-\infty}^{\infty} m\left(e^{itu/\sqrt{m}} - 1 - \frac{itu}{\sqrt{m}}\right) \frac{1}{u^2}\, \mu(du) \right]. \tag{16} $$

Recalling the fact $|e^{ix} - (1 + ix + (ix)^2/2 + \cdots + (ix)^k/k!)| \le |x|^{k+1}/(k + 1)!$ for all real $x$,
we can expand the integrand in (16) as

$$ m\left(e^{iut/\sqrt{m}} - 1 - \frac{iut}{\sqrt{m}}\right) \frac{1}{u^2} = -\frac{t^2}{2} + \frac{(it)^3 u}{6\sqrt{m}} + \frac{(it)^4 u^2}{24 m} + o(1/m) $$

up to an appropriate order and integrate it term by term. Then

$$ \phi_m(t) = \exp\!\left( -\frac{t^2}{2} + \frac{\kappa_3}{6\sqrt{m}} (it)^3 + \frac{\kappa_4}{24 m} (it)^4 \right) (1 + o(1/m)), \tag{17} $$

where $\kappa_j = \int_{-\infty}^{\infty} u^{j-2}\, \mu(du)$ is the $j$-th cumulant of $X$. Note that (17) is formally the same
as the usual Edgeworth expansion of the cumulant generating function of $m$ i.i.d. random
variables. By considering a Lévy process, we can allow $m$ to be fractional and we have a
family of distributions $\{X(m)/\sqrt{m}\}$ indexed by the continuous parameter $m = 1/\theta$. By
the usual Edgeworth expansion, the density function of $X(m)/\sqrt{m}$ is given as

$$ f(x; 1/m) = \frac{1}{\sqrt{2\pi}} e^{-x^2/2} \left(1 + \frac{\kappa_3}{6\sqrt{m}} H_3(x) + \frac{\kappa_4}{24 m} H_4(x) + \frac{\kappa_3^2}{72 m} H_6(x)\right) + o(1/m), $$

where $H_j(x)$ is the $j$-th Hermite polynomial. We now see that i) if $\kappa_3 \ne 0$ then the LBI
test is given by the standardized sample skewness and ii) if $\kappa_3 = 0$ and $\kappa_4 \ne 0$ then the LBI test is
given by the standardized sample kurtosis.
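The quality of the Edgeworth density can be illustrated on the centered exponential example, for which $X(m) + m$ has the Gamma distribution with shape $m$ and the cumulants are $\kappa_3 = 2$, $\kappa_4 = 6$. The sketch below compares the exact density of $X(m)/\sqrt{m}$ with the expansion; the values $m = 50$ and $x = 0.5$ are arbitrary illustrative choices.

```python
import math

def hermite(j, x):
    """Probabilists' Hermite polynomials H_3, H_4, H_6."""
    if j == 3:
        return x ** 3 - 3.0 * x
    if j == 4:
        return x ** 4 - 6.0 * x ** 2 + 3.0
    if j == 6:
        return x ** 6 - 15.0 * x ** 4 + 45.0 * x ** 2 - 15.0
    raise ValueError("only j = 3, 4, 6 are implemented")

def normal_pdf(x):
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def standardized_gamma_density(x, m):
    """Exact density of (G - m) / sqrt(m) where G ~ Gamma(shape=m, scale=1)."""
    y = m + math.sqrt(m) * x
    if y <= 0.0:
        return 0.0
    return math.sqrt(m) * math.exp((m - 1.0) * math.log(y) - y - math.lgamma(m))

def edgeworth_density(x, m, kappa3=2.0, kappa4=6.0):
    """Edgeworth expansion of the density of X(m) / sqrt(m) up to order 1/m."""
    correction = (1.0
                  + kappa3 / (6.0 * math.sqrt(m)) * hermite(3, x)
                  + kappa4 / (24.0 * m) * hermite(4, x)
                  + kappa3 ** 2 / (72.0 * m) * hermite(6, x))
    return normal_pdf(x) * correction

exact = standardized_gamma_density(0.5, 50.0)
approx = edgeworth_density(0.5, 50.0)
```

Since `m` enters only through `math.lgamma` and powers, fractional values of `m` work as well, mirroring the continuous parameter $m = 1/\theta$ of the Lévy process.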

As examples consider the centered exponential distribution and the double-exponential
distribution discussed at the beginning of this section. In the former case we test normality
against the family of normalized Gamma distributions and the LBI test is given by the
standardized sample skewness. In the latter case, the characteristic function of $X(m)/\sqrt{m}$
is given by

$$ \phi_m(t) = \left(1 + \frac{t^2}{2m}\right)^{-m}. $$

This is a dual family of distributions to the t-family in the sense of Dreier & Kotz (2002).
The LBI test against this family is given by the sample kurtosis, as in the case of the t-family.



4 Testing against the stable family 

In this section as an important non-regular case we consider testing against the stable
family. The characteristic function of a general stable distribution ($\alpha \ne 1$) is given by

$$ \Phi(t) = \Phi(t; \mu, \sigma, \alpha, \beta) = \exp\!\left( -|\sigma t|^{\alpha} \left\{1 + i\beta(\operatorname{sgn} t) \tan\!\left(\frac{\pi\alpha}{2}\right) \bigl(|\sigma t|^{1-\alpha} - 1\bigr)\right\} + i\mu t \right), $$

where $\mu$ is the location, $\sigma$ is the scale, $\beta$ is the "skewness" and $\alpha$ is the characteristic
exponent. The parameter space is given by

$$ -\infty < \mu < \infty, \quad \sigma > 0, \quad 0 < \alpha \le 2, \quad |\beta| \le 1. $$

For the standard case $(\mu, \sigma) = (0, 1)$ we simply write the characteristic function as

$$ \Phi(t; \alpha, \beta) = \exp\!\left( -|t|^{\alpha} \left\{1 + i\beta(\operatorname{sgn} t) \tan\!\left(\frac{\pi\alpha}{2}\right) \bigl(|t|^{1-\alpha} - 1\bigr)\right\} \right). \tag{18} $$

This is Zolotarev's (M) parameterization (see p. 11 of Zolotarev (1986)). The corresponding
density is written as $g(x; \mu, \sigma, \alpha, \beta)$ and $g(x; \alpha, \beta)$ in the standard case.

Letting $\alpha = 2$ in (18) we obtain $N(0, 2)$. For convenience let $\theta = 2 - \alpha$, $\mu = a$, $\sigma = b$
and we write

$$ f(x;\theta) = g(x; a, b, 2 - \theta, \beta), $$

where $f(x;0)$ corresponds to $N(0, 2)$. For this section we take $N(0, 2)$ as the standard
member of the normal location-scale family. In the following we fix $\beta$ and for each $\beta$ we
consider the LBI for $H_0 : \theta = 0$ vs. $H_1 : \theta > 0$. This is similar to the case of generalized
hyperbolic distributions. In particular for $\beta = 0$ we are testing normality against the
symmetric stable family, which is important in practice.

It can be shown that we can differentiate $g(x; \alpha, \beta) = \frac{1}{2\pi} \int_{-\infty}^{\infty} e^{-itx}\, \Phi(t; \alpha, \beta)\, dt$ under
the integral sign and the score function is written as

$$ l_\theta(x;0) = -\frac{1}{2\pi f(x;0)} \int_{-\infty}^{\infty} e^{-itx} \frac{\partial}{\partial\alpha} \Phi(t; 2, \beta)\, dt. \tag{19} $$

In particular for $\beta = 0$

$$ l_\theta(x;0) = \frac{1}{2\pi f(x;0)} \int_{-\infty}^{\infty} \cos(tx) \log|t|\; t^2 e^{-t^2}\, dt. $$

The non-regularity of the stable family lies in the fact that this score function has a very
heavy tail. In fact in Matsui (2005) it is shown that for large $|x|$

$$ l_\theta(x;0) = O\!\left( \exp\!\left(\frac{x^2}{4}\right) |x|^{-3} \right). $$

Thus under $N(0, 2)$, $E[l_\theta(x;0)] = 0$ exists but $E[l_\theta(x;0)^2] = \infty$ diverges. This corresponds
to the fact that as $\alpha \uparrow 2$, the Fisher information $I_{\alpha\alpha}$ diverges to infinity. Matsui (2005)
gives a detailed analysis of the Fisher information matrix for the general stable distribution
close to the normal distribution.
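For the symmetric case $\beta = 0$ the score can be evaluated by one-dimensional quadrature. The sketch below reads the score as the ratio of $\frac{1}{\pi}\int_0^\infty \cos(tx)\, t^2 \log t\; e^{-t^2} dt$ to the $N(0,2)$ density; this reading, the composite Simpson rule, and the truncation `t_max` are assumptions made for illustration. At $x = 0$ the integral has the closed form $\Gamma(3/2)(2 - \gamma - 2\log 2)/4$, with $\gamma$ Euler's constant, which is used as a check.

```python
import math

EULER_GAMMA = 0.5772156649015329

def score_numerator(x, t_max=8.0, steps=8000):
    """(1/pi) * int_0^inf cos(t x) t^2 log(t) exp(-t^2) dt by composite Simpson."""
    h = t_max / steps

    def integrand(t):
        if t == 0.0:
            return 0.0  # t^2 log t -> 0 as t -> 0
        return math.cos(t * x) * t * t * math.log(t) * math.exp(-t * t)

    total = integrand(0.0) + integrand(t_max)
    for k in range(1, steps):
        total += (4.0 if k % 2 else 2.0) * integrand(k * h)
    return total * h / 3.0 / math.pi

def normal_0_2_density(x):
    """Density of N(0, 2), the alpha = 2 member of the standard stable family."""
    return math.exp(-x * x / 4.0) / (2.0 * math.sqrt(math.pi))

def stable_score(x):
    """Score at alpha = 2 for beta = 0 under the reading stated above."""
    return score_numerator(x) / normal_0_2_density(x)

closed_form_at_zero = math.gamma(1.5) * (2.0 - EULER_GAMMA - 2.0 * math.log(2.0)) / (4.0 * math.pi)
```

Dividing by the rapidly decaying normal density is what produces the $\exp(x^2/4)$ growth of the score, so outlying observations receive very large weight.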

Although Assumption 1 does not hold for this case and we have to give a separate
proof, the following theorem holds.

Theorem 2. In the general stable family consider testing $H_0 : \alpha = 2$ vs. $H_1 : \alpha < 2$ for
fixed $\beta$. Then the locally best invariant test is given by (6), where the score function is given
in (19).

The proof of this theorem is very technical and it is given in Appendix A.2. Note that the
score function puts extremely heavy weights on outlying observations and this test can be
considered as an outlier detection test. This is intuitively reasonable, because the stable
distribution with $\alpha < 2$ does not possess a finite variance.






5 Tests based on the profile likelihood

In this section we consider tests based on the profile likelihood, where the location and the
scale parameters are estimated by the maximum likelihood. We show that the LBI test
and the test based on the profile likelihood are different in general except for the case that
the score function is a third degree polynomial or a fourth degree polynomial without odd
degree terms. Our argument in this section is formal and we implicitly assume enough
regularity conditions so that our formal argument is justified.
Consider a density close to a normal distribution of the form

$$ f(x;\theta) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\!\left(-\frac{(x - \mu)^2}{2\sigma^2}\right) \left\{1 + \theta h\!\left(\frac{x - \mu}{\sigma}\right) + o(\theta)\right\}, \tag{20} $$

where $h$ is some smooth function.

We estimate $\mu$ and $\sigma$ by the maximum likelihood under the null and under the
alternative and take the ratio of the maximized likelihoods. $\theta$ is considered to be fixed in
the estimation. Since the maximum likelihood estimator is location-scale equivariant, we
obtain an invariant testing procedure. Under the null hypothesis of normal distribution
the maximum likelihood estimates are $\hat\mu = \bar x$ and $\hat\sigma^2 = s^2$. Under the alternative, an
approximation to $\hat\mu$ and $\hat\sigma^2$ to the order of $O(\theta)$ is easily derived as

$$ \hat\mu = \bar x - \theta s\, \frac{1}{n} \sum_{i=1}^n h'(z_i) + o(\theta), \qquad \hat\sigma^2 = s^2 \Bigl(1 - \theta\, \frac{1}{n} \sum_{i=1}^n z_i h'(z_i)\Bigr) + o(\theta). \tag{21} $$

Let $L(\hat\mu, \hat\sigma^2)$ denote the log-likelihood under the alternative and let $L(\bar x, s^2)$ denote the
log-likelihood under the null. Then substituting (21) into (20) we obtain



1 n 1 1 n 6 1 n 

L(/2, a 2 ) = -- V(x 4 - x) 2 -(l + 9-J2^)- n lo S s + tt ~ Zih '^) + °( e ) 

i=l i=l i=l 

9 — 1 n 
= L(x, s 2 ) - ^- z i h '&) + 



n 

i=i 



Hence the test based on the profile likelihood ratio has the rejection region

$$ \sum_{i=1}^n z_i h'(z_i) > k. \tag{22} $$

On the other hand, as discussed in Section 2, for large $n$ the Laplace approximation to
the integral in (6) implies that the LBI is asymptotically equivalent to

$$ \sum_{i=1}^n h(z_i) > k'. \tag{23} $$

We see that (22) and (23) are generally different even asymptotically. It should be noted
that if $h$ is a third degree polynomial or a fourth degree polynomial without odd degree
terms, then both the profile likelihood procedure and the LBI procedure reduce to the
sample skewness and the sample kurtosis.
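The relation between the statistics $\sum_i z_i h'(z_i)$ in (22) and $\sum_i h(z_i)$ in (23) can be made concrete with Hermite polynomials. In the following sketch (the data and the choice of $H_3$, $H_4$, $H_5$ are illustrative assumptions), the two statistics are proportional for $h = H_3$ and $h = H_4$ because $\sum z_i = 0$ and $\sum z_i^2 = n$, while for $h = H_5$ they differ by a multiple of $\sum z_i^3$.

```python
import math

def standardize(xs):
    n = len(xs)
    mean = sum(xs) / n
    s = math.sqrt(sum((x - mean) ** 2 for x in xs) / n)
    return [(x - mean) / s for x in xs]

# Hermite polynomials and their derivatives.
HERMITE = {
    3: (lambda x: x ** 3 - 3 * x,               lambda x: 3 * x ** 2 - 3),
    4: (lambda x: x ** 4 - 6 * x ** 2 + 3,      lambda x: 4 * x ** 3 - 12 * x),
    5: (lambda x: x ** 5 - 10 * x ** 3 + 15 * x, lambda x: 5 * x ** 4 - 30 * x ** 2 + 15),
}

def profile_statistic(zs, order):
    """sum_i z_i h'(z_i), the statistic in (22), for h = H_order."""
    dh = HERMITE[order][1]
    return sum(z * dh(z) for z in zs)

def approx_lbi_statistic(zs, order):
    """sum_i h(z_i), the statistic in (23), for h = H_order."""
    h = HERMITE[order][0]
    return sum(h(z) for z in zs)

zs = standardize([0.1, 0.4, 0.5, 0.9, 1.1, 1.3, 1.8, 2.4, 3.9, 6.0])  # hypothetical data
sum_cubes = sum(z ** 3 for z in zs)
```

For $H_5$ the difference `profile_statistic(zs, 5) - 5 * approx_lbi_statistic(zs, 5)` equals $20\sum z_i^3$, so the two rejection regions are not nested unless the sample skewness vanishes.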
6 Multivariate extensions

In this section we consider multivariate extensions of our results. A comprehensive survey
on invariant tests of multivariate normality is given in Henze (2002).
For a column vector $a \in \mathbb{R}^p$ and a $p \times p$ nonsingular matrix $B$, let

$$ f_{a,B}(x;\theta) = \frac{1}{|\det B|}\, f(B^{-1}(x - a);\theta), \qquad x \in \mathbb{R}^p, \tag{24} $$

be a one-parameter family with the shape parameter $\theta$. As in the univariate case, we
assume that

$$ f(x;0) = \frac{1}{(2\pi)^{p/2}} \exp(-\|x\|^2/2), $$

where $\|\cdot\|$ denotes the standard Euclidean norm in $\mathbb{R}^p$. Based on the i.i.d. samples
$x_1, \dots, x_n$ from $f_{a,B}(x;\theta)$, we discuss invariant testing procedures for testing the normality

$$ H_0 : \theta = 0 \quad \text{vs.} \quad H_1 : \theta > 0. $$

Write $X = (x_1, \dots, x_n)' \in \mathbb{R}^{n \times p}$. Consider the group

$$ \mathbb{R}^p \times GL(p) = \{(a, B) \mid a \in \mathbb{R}^p,\ B \in \mathbb{R}^{p \times p},\ \det B \ne 0\} $$

endowed with the product $(a_1, B_1) \cdot (a_2, B_2) = (B_2 a_1 + a_2, B_2 B_1)$. This group acts on the
sample space $\mathbb{R}^{n \times p}$ as

$$ (a, B)X = 1_n a' + XB', \qquad (a, B) \in \mathbb{R}^p \times GL(p), \tag{25} $$

where $1_n = (1, \dots, 1)' \in \mathbb{R}^n$. For each $\theta$ fixed, the action (25) induces a transitive
action on the parameter space. In other words, the model (24) is a transformation model
with the parameter $(a, B)$. Thus, it is natural to consider invariant procedures under the
action (25).

Let $LT(p)$ be the set of $p \times p$ lower triangular matrices with positive diagonal elements.
Let $\bar x = \sum_{i=1}^n x_i/n$ and $S = \sum_{i=1}^n (x_i - \bar x)(x_i - \bar x)'/n$ be the sample mean vector and the
sample covariance matrix. Let $T \in LT(p)$ be the Cholesky root of $S$ so that $S = TT'$.
Let

$$ Z = (z_1, \dots, z_n)' = (X - 1_n \bar x')(T')^{-1} \tag{26} $$

($z_i = T^{-1}(x_i - \bar x)$, $i = 1, \dots, n$). It is easy to see that a maximal invariant under the
action (25) is

$$ W = ZZ' = (X - 1_n \bar x')\, S^{-1} (X - 1_n \bar x')', $$

and we can choose a cross section $\tilde Z = \tilde Z(X) = (\tilde z_1, \dots, \tilde z_n)' \in \mathbb{R}^{n \times p}$ as a unique
decomposition of $W = \tilde Z \tilde Z'$ in some appropriate way. Note that $\tilde Z = ZQ'$, or $\tilde z_i = Qz_i$, for some
$p \times p$ orthogonal matrix $Q$. The following is a multivariate extension of Proposition 1.






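Proposition 2. Under the group action of $\mathbb{R}^p \times GL(p)$, the critical region of the most
powerful invariant test for testing $H_0 : \theta = 0$ against $H_1 : \theta = \theta_1 > 0$ is given by

$$ \frac{\int_{GL(p)} \int_{\mathbb{R}^p} \prod_{i=1}^n f(a + B\tilde z_i;\theta_1)\, |\det B|^{n-p-1}\, da\, dB}{\int_{GL(p)} \int_{\mathbb{R}^p} \prod_{i=1}^n f(a + B\tilde z_i;0)\, |\det B|^{n-p-1}\, da\, dB} > k \tag{27} $$

for some $k > 0$, where $da = \prod_{i=1}^p da_i$ and $dB = \prod_{i,j=1}^p db_{ij}$ are the Lebesgue measures of
$\mathbb{R}^p$ and $\mathbb{R}^{p \times p}$, respectively.

Proof. The Jacobian of the transformation $X \mapsto (a, B)X = 1_n a' + XB'$ is $|\det B|^n$. The
left invariant measure of $\mathbb{R}^p \times GL(p)$ is $|\det B|^{-(p+1)}\, da\, dB$. From Theorem 4 of Wijsman
(1967), the critical region is

$$ \frac{\int_{GL(p)} \int_{\mathbb{R}^p} \prod_{i=1}^n f(a + Bx_i;\theta_1)\, |\det B|^{n-p-1}\, da\, dB}{\int_{GL(p)} \int_{\mathbb{R}^p} \prod_{i=1}^n f(a + Bx_i;0)\, |\det B|^{n-p-1}\, da\, dB} > k, $$

which is equivalent to (27). □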

Next consider the subgroup

\[ \mathbb{R}^p \times LT(p) = \{(a,T) \mid a \in \mathbb{R}^p,\ T \in LT(p)\} \]

of $\mathbb{R}^p \times GL(p)$. This also acts on the sample space $\mathbb{R}^{n\times p}$ by the action (25) with
$GL(p)$ replaced by $LT(p)$. For this group, the induced action on the parameter $(a,B)$ in
the model (24) is no longer transitive. However, when we consider the subclass of (24)
of the form

\[ f_{a,B}(x;\theta) = \frac{1}{\sqrt{\det(BB')}}\, h\bigl(\|B^{-1}(x-a)\|^2;\theta\bigr) \tag{28} \]

($h$ is a function), the action on the parameter $(a, BB')$ is transitive, and invariant testing
procedures under the group $\mathbb{R}^p \times LT(p)$ may be more appropriate in some cases.

For the action of $\mathbb{R}^p \times LT(p)$, $Z$ in (26) is a maximal invariant, and we can use $Z$ itself
as a cross section.

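The most powerful invariant test under the action of $\mathbb{R}^p \times LT(p)$ is given as follows.

Proposition 3. Under the group action of $\mathbb{R}^p \times LT(p)$, the critical region of the most
powerful invariant test for testing $H_0 : \theta = 0$ against $H_1 : \theta = \theta_1 > 0$ is given by

\[ \frac{\int_{LT(p)}\int_{\mathbb{R}^p} \prod_{i=1}^n f(a + Tz_i;\theta_1)\,da\,\prod_{i=1}^p t_{ii}^{\,n-i-1}\prod_{i\ge j}dt_{ij}}{\int_{LT(p)}\int_{\mathbb{R}^p} \prod_{i=1}^n f(a + Tz_i;0)\,da\,\prod_{i=1}^p t_{ii}^{\,n-i-1}\prod_{i\ge j}dt_{ij}} > k' \]

for some $k' > 0$, where $T = (t_{ij}) \in LT(p)$.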






Proof. The Jacobian of the transformation $X \mapsto (a,T)X = 1_na' + XT'$ is $(\det T)^n =
\prod_{i=1}^p t_{ii}^{\,n}$. The left invariant measure of $\mathbb{R}^p \times LT(p)$ is $\prod_{i=1}^p t_{ii}^{-i-1}\,da\,\prod_{i\ge j}dt_{ij}$. The proposition
follows from Theorem 4 of Wijsman (1967). □



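From Propositions 2 and 3, under conditions similar to Assumption 1, the LBI test
can be derived by integrating the score function $\sum_{i=1}^n \ell_\theta(a + Bz_i;0)$ with respect to $(a,B)$.
In the rest of this section, we examine a particular case where

\[ \prod_{i=1}^n f(a + Bz_i;\theta) = \prod_{i=1}^n f(a + Bz_i;0)\Bigl\{1 + \theta\sum_{i=1}^n \ell_\theta(a + Bz_i;0) + o(\theta)\Bigr\} \]

with

\[ \ell_\theta(x;0) = p_0\|x\|^4 + p_1\|x\|^2 + p_2. \]

This holds, for example, when $f_{a,B}(x;\theta)$ is of the form (28) with

\[ h(s;\theta) = \begin{cases} \dfrac{\Gamma((p+\theta^{-1})/2)}{(\pi/\theta)^{p/2}\,\Gamma(\theta^{-1}/2)}\,(1+\theta s)^{-(p+\theta^{-1})/2} & (\theta > 0), \\[2ex] (2\pi)^{-p/2}\exp(-s/2) & (\theta = 0) \end{cases} \]

(multivariate $t$ distribution with $\theta^{-1}$ degrees of freedom). We restrict our attention to the
case $p_0 > 0$ for simplicity.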
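The quartic form of the score can be checked numerically for the multivariate $t$ example. The sketch below assumes the standard multivariate $t$ density with identity scale matrix and $\nu = 1/\theta$ degrees of freedom (this normalization is an assumption; the paper's may differ by a rescaling). Under that assumption a Taylor expansion in $\theta$ gives $p_0 = 1/4$, $p_1 = -p/2$ and $p_2 = p(p-2)/4$, written in terms of $s = \|x\|^2$. All function names are hypothetical.

```python
import math

def log_t_density(s, p, theta):
    """Log density of the p-variate t with nu = 1/theta d.f. and identity scale,
    as a function of s = ||x||^2 (assumed normalization)."""
    nu = 1.0 / theta
    return (math.lgamma((nu + p) / 2) - math.lgamma(nu / 2)
            - 0.5 * p * math.log(nu * math.pi)
            - 0.5 * (nu + p) * math.log1p(s / nu))

def log_normal_density(s, p):
    """Log density of N_p(0, I_p) as a function of s = ||x||^2 (the theta = 0 case)."""
    return -0.5 * p * math.log(2 * math.pi) - 0.5 * s

def score_numeric(s, p, theta=1e-4):
    """One-sided finite-difference approximation of d/dtheta log f at theta = 0."""
    return (log_t_density(s, p, theta) - log_normal_density(s, p)) / theta

def score_polynomial(s, p):
    """Candidate closed form p0*s^2 + p1*s + p2 with p0 = 1/4, p1 = -p/2, p2 = p(p-2)/4."""
    return 0.25 * s * s - 0.5 * p * s + 0.25 * p * (p - 2)

for s in (0.0, 1.0, 2.5):
    # The two columns should agree up to an O(theta) discretization error.
    print(s, score_numeric(s, 3), score_polynomial(s, 3))
```

Since $p_0 = 1/4 > 0$ under this normalization, the example falls in the case $p_0 > 0$ treated below.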



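Assumption 2.

(i)
\[ \frac{\frac{\partial}{\partial\theta} f(a+Bz;\theta)\big|_{\theta=0}}{f(a+Bz;0)} = p_0\|a+Bz\|^4 + p_1\|a+Bz\|^2 + p_2 \qquad (p_0 > 0). \]

(ii) For some $\varepsilon > 0$,
\[ \int_{GL(p)}\int_{\mathbb{R}^p} g(a,B;\varepsilon)^n \exp\Bigl(-\frac n2\|a\|^2 - \frac n2\operatorname{tr}(B'B)\Bigr)|\det B|^{n-p-1}\,da\,dB < \infty, \]
where
\[ g(a,B;\varepsilon) = \sup_{\|z\|\le 1,\ 0<\theta<\varepsilon} \frac{\bigl|\frac{\partial}{\partial\theta} f(a+Bz;\theta)\bigr|}{f(a+Bz;0)}. \]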



Theorem 3. Under Assumption 2, the rejection region of the LBI test for testing normality
$H_0 : \theta = 0$ vs. $H_1 : \theta = \theta_1 > 0$ under the action of $\mathbb{R}^p \times GL(p)$ is given by

\[ \sum_{i=1}^n \|z_i\|^4 > k. \]

The rejection region of the LBI test under the action of $\mathbb{R}^p \times LT(p)$ is given by

\[ (n+p+2)(n+p)\sum_{i=1}^n\|z_i\|^4 - 2(n+p+2)\sum_{i=1}^n\sum_{j,k=1}^p \max(j,k)\,z_{ij}^2z_{ik}^2 - 2(n+p)\sum_{i=1}^n\sum_{j,k=1}^p \min(j,k)\,z_{ij}^2z_{ik}^2 + 4\sum_{i=1}^n\Bigl(\sum_{j=1}^p j\,z_{ij}^2\Bigr)^2 > k', \]

where $z_{ij}$ is the $j$th element of $z_i$.
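To make the two rejection regions concrete, here is a small Python sketch that evaluates both statistics from standardized observations $z_1, \ldots, z_n$. It assumes the $LT(p)$ statistic has the four-term form obtained from Appendix D with $m = n - p$, which is the same as weighting $z_{ij}^2z_{ik}^2$ by $(n+p+2-2\min(j,k))(n+p-2\max(j,k))$. The function names and the example values are illustrative and not from the paper.

```python
def gl_statistic(Z):
    """Sum of ||z_i||^4 over the rows of Z (statistic for R^p x GL(p))."""
    return sum(sum(v * v for v in z) ** 2 for z in Z)

def lt_statistic(Z):
    """Four-term statistic for R^p x LT(p); indices j, k run from 1 to p."""
    n, p = len(Z), len(Z[0])
    idx = range(1, p + 1)
    total = 0.0
    for z in Z:
        sq = [v * v for v in z]
        norm4 = sum(sq) ** 2
        s_max = sum(max(j, k) * sq[j - 1] * sq[k - 1] for j in idx for k in idx)
        s_min = sum(min(j, k) * sq[j - 1] * sq[k - 1] for j in idx for k in idx)
        s_lin = sum(j * sq[j - 1] for j in idx) ** 2
        total += ((n + p + 2) * (n + p) * norm4 - 2 * (n + p + 2) * s_max
                  - 2 * (n + p) * s_min + 4 * s_lin)
    return total

def lt_statistic_product_form(Z):
    """Same statistic via the weights (n+p+2-2min(j,k)) * (n+p-2max(j,k))."""
    n, p = len(Z), len(Z[0])
    idx = range(1, p + 1)
    return sum((n + p + 2 - 2 * min(j, k)) * (n + p - 2 * max(j, k))
               * z[j - 1] ** 2 * z[k - 1] ** 2
               for z in Z for j in idx for k in idx)

# Illustrative values only; the rows are not exactly standardized here.
Z = [[0.5, -1.0], [1.5, 0.25], [-2.0, 0.75]]
print(gl_statistic(Z), lt_statistic(Z), lt_statistic_product_form(Z))
```

For $p = 1$ the second statistic reduces to $(n-1)(n+1)\sum_i z_i^4$, so both groups then lead to the sample kurtosis.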

The lemma below is used in proving Theorem 3. It is easily proved by some standard
Jacobian formulas in multivariate analysis (e.g., page 86 of Muirhead (1982)).

Lemma 1. Let $\mathrm{Sym}(p)$ denote the set of $p \times p$ real symmetric matrices. Define a map
$\varphi : \mathbb{R}^{p\times p} \to \mathrm{Sym}(p)$ by $\varphi(B) = nB'B$. Then, for any measurable set $D \subset \mathrm{Sym}(p)$,

\[ \int_{\varphi(B)\in D} \exp\Bigl(-\frac n2\operatorname{tr}(B'B)\Bigr)|\det B|^{n-p-1}\,dB \;\propto\; \int_D \exp\Bigl(-\frac12\operatorname{tr}C\Bigr)(\det C)^{\frac12(n-p-2)}\,dC, \]

where $C = (c_{ij}) \in \mathrm{Sym}(p)$ and $dC = \prod_{i\ge j} dc_{ij}$.

Proof of Theorem 3. Note first that $\sum_{i=1}^n \|a+Bz_i\|^2 = n\|a\|^2 + n\operatorname{tr}(B'B)$ because $\sum_{i=1}^n z_i = 0$
and $\sum_{i=1}^n z_iz_i' = nI_p$. Hence the second and the third terms of $\ell_\theta$ do not depend on the $z_i$'s.

In the case of $\mathbb{R}^p \times GL(p)$, the rejection region is of the form $\sum_{i=1}^n I(z_i) > k$, where

\[ I(z) = \int_{GL(p)}\int_{\mathbb{R}^p} \|a+Bz\|^4 \exp\Bigl(-\frac n2\|a\|^2 - \frac n2\operatorname{tr}(B'B)\Bigr)|\det B|^{n-p-1}\,da\,dB. \]

By Lemma 1, the integral of a function of $B'B$ can be replaced by an expectation
with respect to the Wishart distribution $nB'B \sim W_p(n-1, I_p)$. On the other hand,
the integration with respect to $a$ is regarded as the expectation with respect to $\sqrt n\,a \sim
N_p(0, I_p)$. Note that for the Wishart matrix $C \sim W_p(n-1, I_p)$, it holds that

\[ E[z'Cz] = (n-1)\|z\|^2, \qquad E[(z'Cz)^2] = (n-1)(n+1)\|z\|^4. \]

By taking expectations of

\[ \|a+Bz\|^4 = (\|a\|^2 + 2a'Bz + z'B'Bz)^2 = (z'B'Bz)^2 + 2\|a\|^2(z'B'Bz) + 4(a'Bz)^2 + (\text{terms of odd degree in } a) + (\text{a term independent of } z), \]

and noting that the terms proportional to $\|z_i\|^2$ sum to a constant because $\sum_{i=1}^n \|z_i\|^2 = np$,
we see that $\sum_{i=1}^n I(z_i)$ is proportional to $\sum_{i=1}^n \|z_i\|^4 + \mathrm{const}$.
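For reference, the expectation behind this step can be written out explicitly. This is a worked computation under the distributions stated above, taking $a$ and $B$ independent with $\sqrt n\,a \sim N_p(0, I_p)$ and $nB'B \sim W_p(n-1, I_p)$, and using $E\|a\|^2 = p/n$, $E\|a\|^4 = p(p+2)/n^2$ and $E[(a'Bz)^2 \mid B] = z'B'Bz/n$:

\[ E\|a+Bz\|^4 = \frac{(n-1)(n+1)}{n^2}\|z\|^4 + \frac{2(p+2)(n-1)}{n^2}\|z\|^2 + \frac{p(p+2)}{n^2}. \]

Summing over $i$ and using $\sum_{i=1}^n \|z_i\|^2 = np$, only the first term depends on the data beyond an additive constant.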

In the case of $\mathbb{R}^p \times LT(p)$, the rejection region is of the form $\sum_{i=1}^n I(z_i) > k'$, where

\[ I(z) = \int_{LT(p)}\int_{\mathbb{R}^p} \|a+Tz\|^4 \exp\Bigl(-\frac n2\|a\|^2 - \frac n2\operatorname{tr}(T'T)\Bigr)\,da\,\prod_{i=1}^p t_{ii}^{\,n-i-1}\prod_{i\ge j} dt_{ij}. \]

The integration with respect to $T$ is reduced to taking expectations with respect to $nt_{ii}^2 \sim \chi^2_{n-i}$ and
$\sqrt n\,t_{ij} \sim N(0,1)$ ($i > j$). The details are given in Appendix D. □

A Proofs



A.1 Proof of Theorem 1

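By the mean value theorem,

\[ f(x;\theta) = f(x;0) + \theta\frac{\partial}{\partial\theta}f(x;\theta^*), \]

where $0 < \theta^* = \theta^*(x) < \theta$. Then

\[ \prod_{i=1}^n f(a+bz_i;\theta) = \prod_{i=1}^n\Bigl(f(a+bz_i;0) + \theta\frac{\partial}{\partial\theta}f(a+bz_i;\theta^*)\Bigr) = \prod_{i=1}^n f(a+bz_i;0) \times \Bigl(1 + \theta\sum_{i=1}^n \frac{\frac{\partial}{\partial\theta}f(a+bz_i;\theta^*)}{f(a+bz_i;0)} + R\Bigr), \]

where

\[ R = \sum_{l=2}^n \theta^l \sum_{1\le i_1<\cdots<i_l\le n} \frac{\frac{\partial}{\partial\theta}f(a+bz_{i_1};\theta^*)}{f(a+bz_{i_1};0)} \cdots \frac{\frac{\partial}{\partial\theta}f(a+bz_{i_l};\theta^*)}{f(a+bz_{i_l};0)}. \]

By Assumption 1,

\[ \int_0^\infty\int_{-\infty}^\infty |R| \exp\Bigl(-\frac{n(a^2+b^2)}{2}\Bigr) b^{n-2}\,da\,db < \infty. \tag{29} \]

Furthermore, by the continuous differentiability of $f(x;\theta)$ with respect to $\theta$ and the
dominated convergence theorem we have

\[ \int_0^\infty\int_{-\infty}^\infty \sum_{i=1}^n \frac{\frac{\partial}{\partial\theta}f(a+bz_i;\theta^*)}{f(a+bz_i;0)} \exp\Bigl(-\frac{n(a^2+b^2)}{2}\Bigr) b^{n-2}\,da\,db \;\to\; \int_0^\infty\int_{-\infty}^\infty \sum_{i=1}^n \ell_\theta(a+bz_i;0) \exp\Bigl(-\frac{n(a^2+b^2)}{2}\Bigr) b^{n-2}\,da\,db \qquad (\theta \to 0). \]

Now the theorem follows by the standard argument on the locally most powerful test (e.g.
Section 4.8 of Cox & Hinkley (1974)). □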



A.2 Proof of Theorem 2

In the proof, $M > 0$ denotes some suitable constant. Since Assumption 1 is not applicable
to Theorem 2, we have to prove the finiteness of (29) by a separate argument. It suffices
to prove that for each subsequence $1 \le i_1 < \cdots < i_l \le n$,

\[ \int_0^\infty\int_{-\infty}^\infty \Bigl|\frac{\partial}{\partial\theta}f(a+bz_{i_1};\theta^*)\Bigr| \cdots \Bigl|\frac{\partial}{\partial\theta}f(a+bz_{i_l};\theta^*)\Bigr| \prod_{k\notin\{i_1,\ldots,i_l\}} f(a+bz_k;0)\; b^{n-2}\,da\,db < \infty. \]

Without loss of generality consider $i_1 = 1, \ldots, i_l = l$ and write

\[ W_l = \Bigl|\frac{\partial}{\partial\theta}f(a+bz_1;\theta^*)\Bigr| \cdots \Bigl|\frac{\partial}{\partial\theta}f(a+bz_l;\theta^*)\Bigr| \exp\Bigl\{-\frac14\sum_{k=l+1}^n (a+bz_k)^2\Bigr\}\, b^{n-2}. \]

For evaluations of $W_l$ we need the following property of the score function of general stable
distributions. It follows from Lemma 3.1 of Matsui (2005).



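Lemma 2. For $\alpha = 2 - \theta \ne 1$, $|(\partial/\partial\theta)f(x;\theta)|$ is bounded and uniformly continuous in
$x$. Furthermore, as $\theta = 2 - \alpha \downarrow 0$, there exist $M > 0$ and $x_0 > 0$ such that

\[ \Bigl|\frac{\partial}{\partial\theta}f(x;\theta)\Bigr| \le M \cdot |x|^{\theta-3}\log|x|, \qquad \forall\,|x| > x_0. \]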



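The integrability of $W_l$ for $l \le n-1$ follows from that of $W_n$, since $\exp(-x^2/4) \le
M \cdot |\partial/\partial\theta f(x;\theta^*)|$ from Lemma 2. However, the integrability of $W_n$ needs a very detailed
argument. We replace $a$ by $r = a + bz_1$; then $W_n$ becomes

\[ W_n(r,b) = \prod_{k=1}^n \Bigl|\frac{\partial}{\partial\theta}f(r + b(z_k - z_1);\theta^*)\Bigr|\, b^{n-2}. \tag{30} \]

Note that $z_k - z_1 \ne 0$ implies

\[ \exists\, c > 0 \ \text{s.t.}\ \forall k \ne 1,\ |c(z_k - z_1)| > 2, \qquad b > cx > c|r| \ \Longrightarrow\ |r + b(z_k - z_1)| > x. \tag{31} \]

Now we divide the integral of (30) into three parts:

\[ \int_{-\infty}^\infty\int_0^\infty W_n(r,b)\,db\,dr = \Bigl(\int_{|r|\le x_0}\int_0^\infty + \int_{|r|>x_0}\int_{b\le c|r|} + \int_{|r|>x_0}\int_{b>c|r|}\Bigr) W_n(r,b)\,db\,dr = I_1 + I_2 + I_3. \]

Using Lemma 2 and (31) in $I_1$, we have

\[ I_1 \le \int_{|r|\le x_0}\int_{b\le cx_0} W_n(r,b)\,db\,dr + M\int_{|r|\le x_0}\int_{b>cx_0} \max_k\Bigl[\frac{\log|r + b(z_k - z_1)|}{|r + b(z_k - z_1)|^{3-\theta^*}}\Bigr]^{n-1} b^{n-2}\,db\,dr < \infty. \]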

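For $I_2$ the following lemma is useful.

Lemma 3. Suppose that $\{z_k : k \le n,\ z_k \ne z_1\}$ are given. Then

\[ \frac{W_n(r,b)}{\bigl|\frac{\partial}{\partial\theta}f(r;\theta^*)\bigr|} = \prod_{k=2}^n \Bigl|\frac{\partial}{\partial\theta}f(r + b(z_k - z_1);\theta^*)\Bigr| \cdot b^{n-2} \tag{32} \]

is bounded in $-\infty < r < \infty$ and $b > 0$.

Proof. Assume that (32) is not bounded. Choose a sequence of $(r,b)$ along which (32)
diverges to $\infty$. Since the terms in the absolute value on the right-hand side are bounded,
$b$ has to go to $\infty$. By the assumption we can choose $r$ such that for some $k$,

\[ |r + b(z_k - z_1)| \le d\,b^\gamma, \]

where $d > 0$ is a constant and $0 < \gamma < 1$ (otherwise (32) converges to $0$ as $b \uparrow \infty$ by
Lemma 2). Then for $l \ne k$ we have

\[ |r + b(z_k - z_1) - \{r + b(z_l - z_1)\}| = b|z_k - z_l|. \]

Hence as $b \uparrow \infty$,

\[ |r + b(z_l - z_1)| > x_0. \tag{33} \]

Furthermore, since $d\,b^\gamma < b|z_l - z_k|$ for sufficiently large $b$, the triangle inequality gives

\[ \Bigl|\frac{r}{\sqrt b} + \sqrt b\,(z_l - z_1)\Bigr| \ge \sqrt b\,|z_l - z_k| - d\,b^{\gamma-1/2} \uparrow \infty \qquad \text{as } b \uparrow \infty. \]

Therefore, from Lemma 2 and (33), as $b \uparrow \infty$ the left-hand side of (32) satisfies

\[ \Bigl|\frac{\partial}{\partial\theta}f(r + b(z_k - z_1);\theta^*)\Bigr| \prod_{l\ne 1,k} \Bigl|\frac{\partial}{\partial\theta}f(r + b(z_l - z_1);\theta^*)\Bigr| \cdot b^{n-2} \le M \prod_{l\ne 1,k} \frac{\log|r + b(z_l - z_1)|}{|r + b(z_l - z_1)|^{3-\theta^*}} \cdot b^{n-2} \le M \prod_{l\ne 1,k} \frac{\log|r + b(z_l - z_1)|}{|r + b(z_l - z_1)|^{1-\theta^*}} \cdot \frac{1}{|r/\sqrt b + \sqrt b\,(z_l - z_1)|^2} \downarrow 0 \]

regardless of the selection of $k$. This is a contradiction, which completes the proof. □

By Lemma 3 we get

\[ I_2 \le \sup_{r,b}\Bigl\{\frac{W_n(r,b)}{\bigl|\frac{\partial}{\partial\theta}f(r;\theta^*)\bigr|}\Bigr\} \int_{|r|>x_0}\int_{b\le c|r|} \Bigl|\frac{\partial}{\partial\theta}f(r;\theta^*)\Bigr|\,db\,dr \le M\int_{|r|>x_0} \Bigl|\frac{\partial}{\partial\theta}f(r;\theta^*)\Bigr| \cdot 2c|r|\,dr < \infty. \]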



Finally for $I_3$, from Lemma 2 and (31),

\[ I_3 \le M\int_{|r|>x_0} \frac{\log|r|}{|r|^{3-\theta^*}} \int_{b>c|r|} \max_k\Bigl[\frac{\log|r + b(z_k - z_1)|}{|r + b(z_k - z_1)|^{3-\theta^*}}\Bigr]^{n-1} b^{n-2}\,db\,dr. \]

Since $(\log x)^{n-1} < x$ for large $x > 0$, we have

\[ \int_{b>c|r|} \Bigl(\frac{\log|r + b(z_k - z_1)|}{|r + b(z_k - z_1)|^{3-\theta^*}}\Bigr)^{n-1} b^{n-2}\,db \le M \int_{b>c|r|} |r + b(z_k - z_1)|^{1-(n-1)(3-\theta^*)}\, b^{n-2}\,db. \]

The right-hand side is bounded by using equation 2.111.2 on p. 67 of Gradshteyn & Ryzhik
(2000):

\[ \int \frac{x^l}{z_1^m}\,dx = \frac{x^l}{z_1^{m-1}(l+1-m)b'} - \frac{l\,a'}{(l+1-m)b'} \int \frac{x^{l-1}}{z_1^m}\,dx, \]

where $z_1 = a' + b'x$ and $a'$, $b'$ are constants. By induction we obtain

\[ \int \frac{x^l}{(a'+b'x)^m}\,dx = -\frac{x^l}{(m-l-1)(a'+b'x)^{m-1}b'} - \sum_{k=1}^{l} \frac{l(l-1)\cdots(l+1-k)\,a'^k x^{l-k}}{(m-l-1)\cdots(m-l-1+k)(a'+b'x)^{m-1}b'^{k+1}}. \]
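The closed form obtained by induction can be sanity-checked numerically. The Python sketch below (function names and test values are assumptions made for illustration) evaluates the right-hand side as a function $F(x)$ and compares a central finite difference of $F$ with the integrand $x^l/(a'+b'x)^m$.

```python
def antiderivative(x, l, m, a, b):
    """Closed-form antiderivative of x**l / (a + b*x)**m from the induction formula.

    l is a nonnegative integer; m may be any real number with m > l + 1."""
    z = a + b * x
    total = -x ** l / ((m - l - 1) * z ** (m - 1) * b)
    num, den = 1.0, m - l - 1
    for k in range(1, l + 1):
        num *= l + 1 - k          # builds l (l-1) ... (l+1-k)
        den *= m - l - 1 + k      # builds (m-l-1) ... (m-l-1+k)
        total -= num * a ** k * x ** (l - k) / (den * z ** (m - 1) * b ** (k + 1))
    return total

def integrand(x, l, m, a, b):
    """The function x**l / (a + b*x)**m being integrated."""
    return x ** l / (a + b * x) ** m

def derivative_gap(x, l, m, a, b, h=1e-5):
    """Absolute difference between a central difference of F and the integrand."""
    slope = (antiderivative(x + h, l, m, a, b)
             - antiderivative(x - h, l, m, a, b)) / (2 * h)
    return abs(slope - integrand(x, l, m, a, b))

# Should be tiny if the closed form is a valid antiderivative.
print(derivative_gap(2.0, 3, 7.5, 1.5, 0.7))
```

In the application below, $l = n-2$ and $m$ is roughly $(n-1)(3-\theta^*)$, so $m > l+1$ holds and every factor in the denominators is positive.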

Letting $a' = r$, $b' = (z_k - z_1)$, $m = \lfloor (n-1)(3-\theta^*) \rfloor - 1$ and $l = n-2$ in the equation above
and utilizing Lemma 2, we have

\[ \int_{b>c|r|} \Bigl(\frac{\log|r + b(z_k - z_1)|}{|r + b(z_k - z_1)|^{3-\theta^*}}\Bigr)^{n-1} b^{n-2}\,db \le M \cdot \frac{1}{|r|^{\lfloor (n-1)(3-\theta^*) \rfloor - n}}. \]

Since the right-hand side is integrable with respect to $|r| > x_0$, we have $I_3 < \infty$. This
completes the proof.



B Question of parametrization 

Here we briefly discuss how to choose $a(\theta)$ and $b(\theta)$ in (2). Write $\ell(x;\theta) = \log f(x;\theta)$.
Under the assumption that the $3\times 3$ Fisher information matrix exists at $(a(\theta), b(\theta), \theta)$, it is
convenient to determine $(a'(\theta), b'(\theta))$ in such a way that $(d/d\theta)\ell_{a(\theta),b(\theta)}(x;\theta)$ is orthogonal
to the location-scale family in the sense of Fisher information, i.e.

\[ \int \frac{d}{d\theta}\ell_{a(\theta),b(\theta)}(x;\theta)\,\frac{\partial}{\partial a}\ell_{a(\theta),b(\theta)}(x;\theta)\,f_{a(\theta),b(\theta)}(x;\theta)\,dx = 0, \tag{34} \]

\[ \int \frac{d}{d\theta}\ell_{a(\theta),b(\theta)}(x;\theta)\,\frac{\partial}{\partial b}\ell_{a(\theta),b(\theta)}(x;\theta)\,f_{a(\theta),b(\theta)}(x;\theta)\,dx = 0. \tag{35} \]

These give a system of differential equations for $(a(\theta), b(\theta))$.

Actually we are only concerned with a neighborhood of the normal distribution, so
we only consider determining $(a'(0), b'(0))$. At $\theta = 0$, $\ell(x;0) = -(1/2)\log(2\pi) - x^2/2$.
Therefore, with $a(0) = 0$ and $b(0) = 1$,

\[ \frac{d}{d\theta}\ell_{a(\theta),b(\theta)}(x;\theta)\Big|_{\theta=0} = -\frac{b'(0)}{b(0)} + b'(0)x^2 + a'(0)x + \ell_\theta(x;0) \]

and (34), (35) reduce to

\[ \int \Bigl(-\frac{b'(0)}{b(0)} + b'(0)x^2 + a'(0)x + \ell_\theta(x;0)\Bigr) x^k \phi(x)\,dx = 0, \qquad k = 1, 2, \]

which can be solved for $a'(0)$ and $b'(0)$.

Note that we do not necessarily have to solve explicitly for $a'(0)$ and $b'(0)$. Instead,
for theoretical developments we can use the fact that the standard member $f(x;\theta)$ can
be chosen in such a way that

\[ \int \ell_\theta(x;0)\,x^k \phi(x)\,dx = 0, \qquad k = 1, 2. \tag{36} \]

When $\ell_\theta(x;0)$ is a polynomial in $x$, (36) shows that we can choose $\ell_\theta(x;0)$ such that it
is cubic or of higher degree in $x$. This is enough for simplifying our treatment of mixing
distributions in Section 3.1.
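Condition (36) can be illustrated for polynomial scores with exact normal moments. The sketch below is an illustration under assumptions made here: it represents a polynomial by its coefficient list $[q_0, q_1, \ldots]$, uses $E[X^r] = (r-1)!!$ for even $r$ under $N(0,1)$, and removes the components along $1$, $x$ and $x^2 - 1$. It also checks $k = 0$, since a score function has mean zero. The helper names are hypothetical.

```python
def normal_moment(r):
    """E[X^r] for X ~ N(0, 1): zero for odd r, (r-1)!! for even r."""
    if r % 2 == 1:
        return 0.0
    value = 1.0
    for odd in range(1, r, 2):
        value *= odd
    return value

def inner(q, k):
    """Integral of q(x) x^k phi(x) dx, with q given as coefficients [q0, q1, ...]."""
    return sum(c * normal_moment(j + k) for j, c in enumerate(q))

def orthogonalize(q):
    """Subtract from q its projections on 1, x and x^2 - 1."""
    c0 = inner(q, 0)                        # <q, 1> / <1, 1>
    c1 = inner(q, 1)                        # <q, x> / <x, x>
    c2 = (inner(q, 2) - inner(q, 0)) / 2    # <q, x^2 - 1> / <x^2 - 1, x^2 - 1>
    out = list(q) + [0.0] * max(0, 3 - len(q))
    out[0] -= c0 - c2                       # constant part of c0 + c2 (x^2 - 1)
    out[1] -= c1
    out[2] -= c2
    return out

score = orthogonalize([0.0, 0.0, 0.0, 0.0, 1.0])   # start from q(x) = x^4
print(score)                                        # coefficients of x^4 - 6x^2 + 3
print([inner(score, k) for k in (0, 1, 2)])         # all three integrals vanish
```

Starting from $x^4$, the result is the fourth Hermite polynomial $x^4 - 6x^2 + 3$, whose data-dependent part in the LBI statistic is driven by the fourth sample moment.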



C Details in the case of polynomial score function 

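Here we write out the coefficients of the LBI test statistic in the case of a polynomial score function (cf. (7)).
Suppose that $\ell_\theta(x;0)$ is given as

\[ \ell_\theta(x;0) = c_0x^k + c_1x^{k-1} + \cdots + c_k = \sum_{j=0}^k c_{k-j}x^j. \]

Then

\[ \sum_{i=1}^n \ell_\theta(a+bz_i;0) = \sum_{i=1}^n\sum_{j=0}^k c_{k-j}(a+bz_i)^j = n\sum_{j=0}^k c_{k-j}\sum_{l=0}^j \binom{j}{l} a^l b^{j-l}\,\tilde m_{j-l}, \]

where $\tilde m_r = n^{-1}\sum_{i=1}^n z_i^r$. Using (14), for even $l$ we have

\[ \int_0^\infty\int_{-\infty}^\infty a^l b^{j-l} \exp\Bigl(-\frac{n(a^2+b^2)}{2}\Bigr) b^{n-2}\,da\,db = \frac12\Bigl(\frac2n\Bigr)^{(n+j)/2} \Gamma\Bigl(\frac{l+1}{2}\Bigr) \times \Gamma\Bigl(\frac{n+j-l-1}{2}\Bigr). \]

For odd $l$ the integral is zero. Also we only need to consider $j - l \ge 3$, because $\tilde m_0 = 1$, $\tilde m_1 = 0$ and $\tilde m_2 = 1$ are constants. Hence, up to a positive constant factor, the LBI test statistic
is given as

\[ \sum_{j=3}^k c_{k-j}\Bigl(\frac2n\Bigr)^{j/2} \sum_{l=0,\ l:\,\mathrm{even}}^{j-3} \binom{j}{l}\, \Gamma\Bigl(\frac{l+1}{2}\Bigr)\Gamma\Bigl(\frac{n+j-l-1}{2}\Bigr)\,\tilde m_{j-l}. \]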
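A minimal Python sketch of this computation follows. It assumes the closed form $\frac12(2/n)^{(n+j)/2}\,\Gamma(\frac{l+1}{2})\,\Gamma(\frac{n+j-l-1}{2})$ for the double integral (obtained here by direct integration of the two Gaussian-type integrals) and the sample moments $\tilde m_r = n^{-1}\sum_i z_i^r$. The function names and the example values are illustrative only.

```python
import math

def weight(l, r, n):
    """Integral of a^l b^r exp(-n(a^2+b^2)/2) b^(n-2) over a in R and b > 0.

    Returns zero for odd l; here r plays the role of j - l."""
    if l % 2 == 1:
        return 0.0
    return (0.5 * (2.0 / n) ** ((n + l + r) / 2)
            * math.gamma((l + 1) / 2) * math.gamma((n + r - 1) / 2))

def lbi_statistic(c, z):
    """Data-dependent part of the integrated score for
    l_theta(x; 0) = c[0] x^k + c[1] x^(k-1) + ... + c[k].

    Terms with j - l < 3 are dropped because they involve only constant moments."""
    n, k = len(z), len(c) - 1
    total = 0.0
    for j in range(3, k + 1):
        for l in range(0, j - 2, 2):          # even l with j - l >= 3
            moment = sum(v ** (j - l) for v in z) / n
            total += n * c[k - j] * math.comb(j, l) * weight(l, j - l, n) * moment
    return total

z = [1.5, -0.5, -1.0, 0.0]                            # illustrative values
skew_part = lbi_statistic([1.0, 0.0, 0.0, 0.0], z)    # score x^3
kurt_part = lbi_statistic([1.0, 0.0, 0.0, 0.0, 0.0], z)  # score x^4
print(skew_part, kurt_part)
```

For the score $x^3$ only the term $j = 3$, $l = 0$ survives, so the statistic is a positive multiple of $\tilde m_3$; for $x^4$ only $j = 4$, $l = 0$ survives, giving a positive multiple of $\tilde m_4$.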

D Moments of z'T'Tz 

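Let $T = (t_{ij}) \in LT(p)$ be a random matrix whose diagonal and lower off-diagonal elements
are independently distributed as $t_{ii}^2 \sim \chi^2_{m+p-i}$, $t_{ij} \sim N(0,1)$ ($i > j$), where $m > 0$ is a
constant. Let $z = (z_1, \ldots, z_p)' \in \mathbb{R}^p$ be a constant vector. In this section we evaluate the
expectations

\[ R_p(z) = E[z'T'Tz], \qquad S_p(z) = E[(z'T'Tz)^2] \]

required in proving Theorem 3.

Write $z_2 = (z_i)_{2\le i\le p}$, $t_{21} = (t_{i1})_{2\le i\le p}$ and $T_{22} = (t_{ij})_{2\le i,j\le p}$. Then $z$ and $T$ are
represented as block matrices

\[ z = \begin{pmatrix} z_1 \\ z_2 \end{pmatrix}, \qquad T = \begin{pmatrix} t_{11} & 0 \\ t_{21} & T_{22} \end{pmatrix}. \]

Note that

\[ z'T'Tz = (z_1, z_2') \begin{pmatrix} t_{11} & t_{21}' \\ 0 & T_{22}' \end{pmatrix} \begin{pmatrix} t_{11} & 0 \\ t_{21} & T_{22} \end{pmatrix} \begin{pmatrix} z_1 \\ z_2 \end{pmatrix} = z_1^2(t_{11}^2 + t_{21}'t_{21}) + 2z_1t_{21}'T_{22}z_2 + z_2'T_{22}'T_{22}z_2. \]

By taking the expectation with respect to $t_{11}^2 \sim \chi^2_{m+p-1}$ and $t_{21} \sim N_{p-1}(0, I_{p-1})$, we have

\[ \begin{aligned} R_p(z) &= z_1^2(m+p-1+p-1) + R_{p-1}(z_2) \\ &= z_1^2(m+2p-2) + R_{p-1}(z_2, \ldots, z_p) \\ &= \sum_{i=1}^p z_i^2(m+2p-2i). \end{aligned} \]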



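Also,

\[ \begin{aligned} (z'T'Tz)^2 &= \{z_1^2(t_{11}^2 + t_{21}'t_{21}) + 2z_1t_{21}'T_{22}z_2 + z_2'T_{22}'T_{22}z_2\}^2 \\ &= z_1^4(t_{11}^2 + t_{21}'t_{21})^2 + 4z_1^2(t_{21}'T_{22}z_2)^2 + (z_2'T_{22}'T_{22}z_2)^2 \\ &\quad + 4z_1^3(t_{11}^2 + t_{21}'t_{21})\,t_{21}'T_{22}z_2 + 2z_1^2(t_{11}^2 + t_{21}'t_{21})\,z_2'T_{22}'T_{22}z_2 \\ &\quad + 4z_1\,t_{21}'T_{22}z_2\;z_2'T_{22}'T_{22}z_2. \end{aligned} \]

Noting that $E[(\chi^2_\nu)^2] = \nu(\nu+2)$, we have

\[ \begin{aligned} S_p(z) &= z_1^4(m+2p-2)(m+2p) + 4z_1^2R_{p-1}(z_2) + S_{p-1}(z_2) + 2z_1^2(m+2p-2)R_{p-1}(z_2) \\ &= z_1^4(m+2p-2)(m+2p) + 2z_1^2(m+2p)R_{p-1}(z_2) + S_{p-1}(z_2) \\ &= (z_1^2, \ldots, z_p^2)\,A_p \begin{pmatrix} z_1^2 \\ \vdots \\ z_p^2 \end{pmatrix}, \end{aligned} \]

where

\[ \begin{aligned} A_p &= \begin{pmatrix} (m+2p)(m+2p-2) & (m+2p)(m+2p-4) & (m+2p)(m+2p-6) & \cdots & (m+2p)m \\ (m+2p)(m+2p-4) & & & & \\ (m+2p)(m+2p-6) & & A_{p-1} & & \\ \vdots & & & & \\ (m+2p)m & & & & \end{pmatrix} \\ &= \begin{pmatrix} m+2p & m+2p & \cdots & m+2p \\ m+2p & m+2p-2 & \cdots & m+2p-2 \\ \vdots & \vdots & \ddots & \vdots \\ m+2p & m+2p-2 & \cdots & m+2 \end{pmatrix} \odot \begin{pmatrix} m+2p-2 & m+2p-4 & \cdots & m \\ m+2p-4 & m+2p-4 & \cdots & m \\ \vdots & \vdots & \ddots & \vdots \\ m & m & \cdots & m \end{pmatrix} \\ &= \bigl[(m+2p+2-2\min(i,j))\,(m+2p-2\max(i,j))\bigr]_{1\le i,j\le p} \\ &= \bigl[(m+2p+2)(m+2p) - 2(m+2p+2)\max(i,j) - 2(m+2p)\min(i,j) + 4\max(i,j)\min(i,j)\bigr]_{1\le i,j\le p}. \end{aligned} \]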

Here $\odot$ denotes the elementwise multiplication of matrices. This means

\[ S_p(z) = (m+2p+2)(m+2p)\Bigl(\sum_{i=1}^p z_i^2\Bigr)^2 - 2(m+2p+2)\sum_{i,j=1}^p \max(i,j)\,z_i^2z_j^2 - 2(m+2p)\sum_{i,j=1}^p \min(i,j)\,z_i^2z_j^2 + 4\Bigl(\sum_{i=1}^p i\,z_i^2\Bigr)^2. \]
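The recursion and the closed form can be cross-checked numerically. The following Python sketch is an illustration with hypothetical function names: it implements $R_p(z) = \sum_i z_i^2(m+2p-2i)$, the recursion $S_p(z) = z_1^4(m+2p-2)(m+2p) + 2z_1^2(m+2p)R_{p-1}(z_2) + S_{p-1}(z_2)$, and the four-term closed form, and compares the last two.

```python
def R(z, m):
    """R_p(z) = E[z'T'Tz] = sum_i z_i^2 (m + 2p - 2i), with i = 1..p and p = len(z)."""
    p = len(z)
    return sum(z[i - 1] ** 2 * (m + 2 * p - 2 * i) for i in range(1, p + 1))

def S_recursive(z, m):
    """S_p(z) = E[(z'T'Tz)^2] via the recursion on the leading coordinate."""
    p = len(z)
    if p == 0:
        return 0.0
    z1, rest = z[0], z[1:]
    return (z1 ** 4 * (m + 2 * p - 2) * (m + 2 * p)
            + 2 * z1 ** 2 * (m + 2 * p) * R(rest, m)
            + S_recursive(rest, m))

def S_closed(z, m):
    """Four-term closed form of S_p(z)."""
    p = len(z)
    idx = range(1, p + 1)
    sq = [v * v for v in z]
    s_max = sum(max(i, j) * sq[i - 1] * sq[j - 1] for i in idx for j in idx)
    s_min = sum(min(i, j) * sq[i - 1] * sq[j - 1] for i in idx for j in idx)
    s_lin = sum(i * sq[i - 1] for i in idx) ** 2
    return ((m + 2 * p + 2) * (m + 2 * p) * sum(sq) ** 2
            - 2 * (m + 2 * p + 2) * s_max
            - 2 * (m + 2 * p) * s_min
            + 4 * s_lin)

z, m = [0.5, -1.2, 2.0], 4   # illustrative values
print(S_recursive(z, m), S_closed(z, m))   # the two values should coincide
```

With $p = 1$ both expressions reduce to $m(m+2)z_1^4$, the second moment of a $\chi^2_m$ variable times $z_1^4$.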


